Abstract. As a contribution to higher dimensional spatial data modelling
this article introduces a novel approach to spatial database design.
Instead of extending the canonical Solid-Face-Edge-Vertex schema
of topological data, these classes are replaced altogether by a common 
type SpatialEntity, and the individual "bounded-by" relations between 
two consecutive classes are replaced by one separate binary relation 
BoundedBy on SpatialEntity. That relation defines a so-called Alexan- 
drov topology on SpatialEntity and thus exposes the fundamental math- 
ematical principles of spatial data design. This has important conse- 
quences: First, a formal definition of topological "dimension" for spatial 
data can be given. Second, every topology for data of arbitrary dimension 
has such a simple representation. Also, version histories have a canonical
Alexandrov topology, generalisations can be consistently modelled by the new
consistency rule "continuous functions between LoDs", and monotonicity enables
accelerated path queries. The result is a relational database schema for
spatial data of dimension 6 and more which
seamlessly integrates 4D space-time, levels of details and version history. 
Topological constructions enable queries across these different aspects. 



1 Introduction 

2D and 3D spatial models are well established for spatial data modelling, and
there exist standards like CityGML in geo-spatial modelling and IFC for
architectural models. Currently there is active research on spatio-temporal
queries [1] as well as 4D spatio-temporal modelling [2], and also on considering
other aspects like scale [5] as additional dimensions of spatial data. Integrating
the versioning graph, by itself 1D, also pushes up the dimension upper bound.

Besides that, research on nD modelling provides generic spatial data models
without a fixed dimension upper bound [4] and often gives a formal definition
of "topological dimension" of spatial data. Topology has its own sub-discipline
called "dimension theory" [5] where the possible definitions of such "topological
dimension" are investigated. Among these, the Krull dimension [6, p. 5] is
particularly applicable to topological data and is proposed here as a standard
definition of spatial data dimension. Throughout this article, the "dimension" of
spatial data is the Krull dimension of the topological space established by the
data entities.

Challenged by a scientist's request to show that there are applications for
data of dimension beyond 4, and inspired by van Oosterom's ideas [7], this
article demonstrates the mathematical foundations of generic nD spatial modelling
and its use for combining 3D spatial data, time, scale, and versioning into an
integrated 6D+ model. In particular, it will be shown that sensible integration
of scale increases the dimension of spatial data by more than one, hence the "+"
in 6D+. Moreover, further increasing the dimension can decrease the complexity
of the data model, which refutes statements like "nD modelling leads to a
combinatorial explosion of complexity".

The reader is assumed to be familiar with basic concepts of mathematical topology.
To facilitate reading, references are given to locations in textbooks explaining
the applied concepts in more detail. As the main intent is to expose the
mathematics of nD modelling, the relational model, formulated by Codd [8],
is used. Its sound mathematical basis facilitates its topologising in an elegant
way [9], and the principles described here are also applicable to object oriented
modelling. They also serve as a basis for a generic topological database model as
an extension of the relational model: instead of tables, there are spaces on which
queries operate and which they return as a result. The first author of this
article has implemented an experimental prototype of such a topological relational
database model which runs on http://pavel.gik.kit.edu/



2 Dimension 

In 3D spatial data usually four kinds of topological entities are represented in
computer storage: Vertices are zero-dimensional discrete points in ℝ³. Edges are
one-dimensional manifolds which are the interior of paths starting at one vertex
and ending at another vertex. In general, an edge is bounded by two vertices.
Faces are two-dimensional manifolds enclosed by at least one loop of edges. Some
models allow specifying more loops where one loop is the outer boundary and
each additional loop is a hole boundary. Topologically, there is no difference
between an "outer" boundary loop and a "hole" boundary loop because every
hole can be made an outer boundary by stretching it wide enough and then
flipping the face over. To wrap this up, a face is bounded by edges. Solids are
three-dimensional manifolds enclosed by a set of faces which constitute a cavity
within which the solid resides. Such a cavity is often called a shell and, again, a
solid is bounded by faces. Most 3D models establish a "chain" of four classes
with two consecutive classes connected by a "bounded-by" association (cf. [10]
for an overview). Note that the chain length equals the model dimension.

Analogously to 3D spatial models, time can be modelled by the real line ℝ.
Each moment in time can be represented by a real number, thus resembling
a vertex in ℝ³. A time span is an open interval $(t_1, t_2)$ bounded by a starting
point $t_1$ and an ending point $t_2$, thus resembling an edge. This gives two classes
of temporal entities: Moment and Timespan where each one-dimensional timespan
is bounded by two zero-dimensional moments. As for spatial data,
this "bounded by" association can be considered a special case of an association
chain of length 1 which, again, corresponds with the dimension.

When data is changed over time several versions of the data exist. In ver- 
sioning software two consecutive versions along a version history are commonly 
connected by an edge. Versions can also fork and merge and so a version history 
is a directed acyclic graph (DAG) with two classes, a Version and a Transition 
and each transition is bounded by an initial and a terminal version. So we could 
say that the "dimension" of a version history is one: there is only one association 
from Transition to Version. We will later take an alternative view on the version 
history which results in a different dimension. 

Spatial data is often organised hierarchically at different levels of "scale" 
or "detail" (LoD). For smooth transitions between LoDs it is often proposed 
to interpolate between consecutive LoDs by continuous morphing [7]. Then the
space between two consecutive LoDs is considered an edge bounded by two LoDs. 
So, we have a one-dimensional space. 

Thus we have four types of spaces which we might call "elementary spaces":
the "spatial" space which is the Euclidean 3D space, the one-dimensional Euclidean
temporal space, a one-dimensional version space, and another one-dimensional
scale space which is essentially a linear graph ··· → • → • → • → ···.

Now a combination of these elementary spaces can be used to get higher- 
dimensional "combined spaces" . In such a combined space a 3D Solid s and a 
1D Timespan ts can be combined to a 4D pair (s, ts) which represents a 4D
entity in space-time: the trajectory of s during ts. But for a zero-dimensional 
Moment in time t, the pair (s, t) is a 3D element in space-time representing that 
solid s at the moment t. When ts is bounded by t then the 4D-entity (s, ts) is 
bounded by a 3D-entity (s, t) and for each lower-dimensional nD element x in 
the bounded-by association chain of s there is a pair (x, t) in a corresponding 
association chain of (s, t). This means that in the 4D model the length of the 
association chains has increased by 1. Within that model each pair (a, b) satisfies 
the dimension formula dim(a, b) = dim a + dim b.
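
The pairing rule and the dimension formula above can be checked on a small example. The following Python sketch is an illustration only: the function names `product_space` and `dimension` and the entity labels are assumptions made here, not part of the article. It builds all pairs of a spatial and a temporal entity, lets a pair be bounded whenever one of its components is bounded while the other stays fixed, and measures the longest bounded-by chain.

```python
from itertools import product

def product_space(space_a, space_b):
    """Pairs (a, b); a pair is bounded by (a', b) if a is bounded by a',
    and by (a, b') if b is bounded by b'."""
    (pa, ra), (pb, rb) = space_a, space_b
    points = set(product(pa, pb))
    rel = {((a, b), (a2, b)) for (a, a2) in ra for b in pb}
    rel |= {((a, b), (a, b2)) for a in pa for (b, b2) in rb}
    return points, rel

def dimension(points, rel):
    """Length of the longest bounded-by chain, assuming rel is acyclic."""
    succ = {}
    for a, b in rel:
        succ.setdefault(a, []).append(b)
    memo = {}
    def depth(x):
        if x not in memo:
            memo[x] = max((1 + depth(y) for y in succ.get(x, [])), default=0)
        return memo[x]
    return max((depth(x) for x in points), default=0)

# Classical 3D chain and the 1D temporal chain (labels are illustrative).
space = ({"solid", "face", "edge", "vertex"},
         {("solid", "face"), ("face", "edge"), ("edge", "vertex")})
time = ({"timespan", "moment"}, {("timespan", "moment")})
points, rel = product_space(space, time)

print(dimension(*space), dimension(*time), dimension(points, rel))
```

In this toy case the chain lengths add up, as the formula states for the pair (solid, timespan).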

As every database management system (DBMS) can only model finite sets we 
need a definition of "dimension" for finite spaces. As seen above, our nD spaces 
consist of entities, possibly distributed over several classes, and a bounded-by 
relation of chain length n. Hence "dimension" of spatial data is the maximal 
length of a chain of entities such that each is bounded by a consecutive element.
We call this the combinatorial dimension of spatial data. The note
proves that this combinatorial dimension is equivalent to the topological Krull
dimension [6, p. 5]. Note that even a simple sequence $(n, n-1, \dots, 0)$ can be
considered a set $\{n, n-1, \dots, 0\}$ with "association" $a \mathrel{S} b \iff a - 1 = b$, which
then has a combinatorial dimension of n and obviously does not suffer from any
"combinatorial explosion of complexity" whatsoever. We can even decrease the
complexity of spatial data by increasing its dimension.
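
As a hedged illustration of the combinatorial dimension, the following Python sketch computes the maximal chain length of an acyclic bounded-by relation. The function name `combinatorial_dimension` and the triangle example are assumptions introduced here for illustration; the article itself gives no algorithm.

```python
from functools import lru_cache

def combinatorial_dimension(entities, bounded_by):
    """Number of bounded-by steps on the longest chain (relation must be acyclic)."""
    successors = {e: [] for e in entities}
    for a, b in bounded_by:
        successors[a].append(b)

    @lru_cache(maxsize=None)
    def longest_from(e):
        # Longest chain that starts at e, counted in bounded-by steps.
        return max((1 + longest_from(b) for b in successors[e]), default=0)

    return max((longest_from(e) for e in entities), default=0)

# The sequence (n, n-1, ..., 0), where a is associated with a - 1.
n = 6
chain = [(a, a - 1) for a in range(1, n + 1)]
dim_chain = combinatorial_dimension(range(n + 1), chain)

# One triangle: a face, three edges, three vertices.
tri_entities = ["f", "e1", "e2", "e3", "v1", "v2", "v3"]
tri_rel = [("f", "e1"), ("f", "e2"), ("f", "e3"),
           ("e1", "v1"), ("e1", "v2"), ("e2", "v2"),
           ("e2", "v3"), ("e3", "v3"), ("e3", "v1")]
dim_triangle = combinatorial_dimension(tri_entities, tri_rel)
```

The chain with n + 1 elements and n pairs reaches dimension n, while the triangle with more entities and pairs stays at dimension 2, which is the point made above about dimension and complexity.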

3 Spatial Dimension and Consistency

Usually, in a DBMS consistency rules are design tools, and it lies at the dis- 
cretion of the user to make use of them. However, in spatial data modelling, 
"topological consistency" is often mentioned, but the rules to tell "consistent" 
from "inconsistent" spaces vary in the literature. We see two reasons: First, as 
mentioned above, the user should be entitled to decide which spatial data he 
considers "consistent" and which he does not. Second, the term "topological 
(in)consistency" is unknown in mathematics. Topology provides a rich set of 
well-defined topological properties which may or may not be used for a particular
application as a consistency rule. We will discuss here some "consistency" rules
for nD spatial entities and present applications where these rules do not apply.

A vertex is usually composed of an id, the coordinates, and additional attributes.
The consistency rule is that a vertex is not bounded by another object.
However, this property can be used to define a vertex: in a topological space a
vertex is an element that is not bounded by another element. A practical
application is the room connectivity graph of a building: in this graph the rooms
are the 0D vertices, and the doors are their connecting 1D edges. Each door is
then "bounded by" its two rooms. This possibility of "shifting down" the dimension
and inverting it by "flipping" the boundary relation is a characteristic of the
topological spaces occurring in spatial data models. That "flipping" is intimately
related to the Poincaré duality, with which it should not be confused.

Some spatial data models define faces by a cycle of vertices around that face. 
This gives implicit edges between two consecutive vertices in the cycle. But we 
will only consider models with explicitly given edges. Then an edge usually has 
two references to two boundary vertices. When we weaken that rule of exactly 
two vertices we can define "edge" as a spatial entity with non-empty boundary 
that consists only of vertices. This allows selection results where the selection 
predicate only applies to, say, two edges but only one common connecting vertex. 
It would then be unwise to forget the connectivity information in the selection 
result only because it fails to satisfy the global constraints. Moreover, if an edge 
is allowed to have more than one vertex, then the data model allows hypergraphs, 
an abstract topological structure often used in practice. Additionally, an edge 
with four vertices might be considered an edge with a hole that occurs e.g. in 
intersections with non-convex faces. 

The classical consistency rule for faces is that the boundary edges must form 
loops. Mathematically, this means that a boundary must form a cycle, which 
is the fundamental property of a chain complex from algebraic topology [12].
However, it makes sense in general to permit query results violating consistency 
rules that have been set up for the underlying query input spaces. 

Another consistency rule is hard-coded into the classical sequential class 
schema: The only possible association between a face and one of its bound- 
ary vertices is indirectly via an intermediate edge. The schema does not allow a 
direct face-vertex association that "bypasses" the edge class. However, a vertex 
within a face may be considered, say, a collapsed interior face, e.g. a city in a 
region at a lower LoD. If we now specify a superclass SpatialEntity of Solid, Face, 
Edge, and Vertex and replace the associations between two consecutive classes
by one association of that class to itself, that rule can be made optional, too. 

Making the above consistency constraints optional immediately leads to a
directed acyclic graph (DAG) of spatial entities. A DAG defines a partial ordering
on its elements, and since 1937 it has been well known that partial orderings are
essentially the same as the so-called $T_0$ Alexandrov topologies [13], or, to put it
short, spatial data models are topological spaces. We will now provide an initial
relational database schema for 3D spatial data based on this observation:

X(id, attributes),  R(ida, idb),  (ida) → X,  (idb) → X        (1)
Vertex(vid, x : ℝ, y : ℝ, z : ℝ),  (vid) → X.

Table X contains the spatial entities, and table R specifies the "bounded-by" 
relation. Primary key attributes are underlined. Attribute attributes is a place- 
holder for the set of "semantic" , i.e. non-spatial attributes. The only consistency 
rule here is that R should be acyclic. Although every relation R defines an
Alexandrov topology T(R) of the space

$(X, T(R)), \quad T(R) := \{ A \subseteq X \mid \forall (a, b) \in R : b \in A \Rightarrow a \in A \}$,

that space is $T_0$-separable iff R is acyclic [14, Ex. 133]. $T_0$-separability is a sensible
consistency rule for spatial data, and will be maintained here. Note, however,
that non-separable spaces are also valid topological spaces, and so from the
mathematical viewpoint acyclicity is an optional consistency rule, too. For each
topology-defining relation R on X we assume the existence of a view on its
pre-order R* which is the transitive and reflexive closure of R:

create view poR as
with recursive Rst(ida, idb) as (
    select id as ida, id as idb from X
    union
    select Rst.ida, R.idb
    from Rst join R on (Rst.idb = R.ida))
select * from Rst;

This schema allows spatial data of arbitrary dimension, but for the moment we
assume that R chains have length ≤ 3, so the topological dimension matches the
"geometric dimension", i.e. the number of coordinate attributes of Vertex. Also,
the id of all vertices should occur in table Vertex. For obvious reasons, we call
a pair (X, R) consisting of a set X and a binary relation R on X a topological
data type, or sometimes simply space.
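
To make the notions of this section concrete, here is a minimal Python sketch of a topological data type (X, R). It is an illustration under assumptions: the function names are invented here, the closure mirrors what the view poR computes, and the separation test checks that R* never relates two distinct points in both directions.

```python
def reflexive_transitive_closure(points, rel):
    """R* as a set of pairs, built step by step like the recursive view poR."""
    closure = {(x, x) for x in points}
    frontier = set(closure)
    while frontier:
        step = {(a, c) for (a, b) in frontier for (b2, c) in rel if b == b2}
        frontier = step - closure
        closure |= frontier
    return closure

def is_open(subset, rel):
    """A is open iff for every (a, b) in R, b in A implies a in A."""
    return all(a in subset for (a, b) in rel if b in subset)

def is_t0(points, rel):
    """True iff no two distinct points are related both ways in R*."""
    star = reflexive_transitive_closure(points, rel)
    return not any(a != b and (b, a) in star for (a, b) in star)

# Illustrative space: a face bounded by an edge bounded by a vertex.
X = {"face", "edge", "vertex"}
R = {("face", "edge"), ("edge", "vertex")}
po = reflexive_transitive_closure(X, R)
```

With this orientation the highest-dimensional entities form open sets on their own, while a set containing a vertex is open only if it also contains everything the vertex bounds.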

4 Temporal Dimension 

Here, we establish the 4D space-time which models changes of 3D space over
time. For illustrative reasons, a 2D space-time of a 1D "building" example in
Lineland [15, ch. 13] changing over 1D time will be discussed first. Assume in
1D Lineland a "house" with interior $I$ and boundary walls $w_l$ to the left and $w_r$
to the right as depicted on top of Fig. 1. The boundary of $I$ consists of the vertices $w_l$
and $w_r$. The house is erected at time $t_0$, modelled by "tagging" $I$, $w_l$ and $w_r$
with $t_0$, giving the pairs $(I, t_0)$, $(w_l, t_0)$, and $(w_r, t_0)$. The spatial "bounded-by"
associations from $I$ to $w_l$ and from $I$ to $w_r$ are carried over as tagged space-time
associations from $(I, t_0)$ to $(w_l, t_0)$ and from $(I, t_0)$ to $(w_r, t_0)$. The interior
$I$, existing over a time-span $s_{0,1}$, is also tagged, giving its trajectory $(I, s_{0,1})$.
This element is bounded in the horizontal (i.e. "spatial") direction by $(w_l, s_{0,1})$
because $I$ is bounded by $w_l$. Here the "tag" $s_{0,1}$ does not change in this boundary
association. The element $(I, s_{0,1})$ is also bounded in the vertical (i.e. "temporal")
direction by $(I, t_0)$. Here the boundary association fixes element $I$ and is taken
from the boundary association between $s_{0,1}$ and $t_0$.

Fig. 1. The process of a 1D house in Lineland constructed at time $t_0$, extended by
a portion X at time $t_1$, and demolished at time $t_2$, is modelled as a 2D space-time
complex in two steps. The middle complex repeats each element at a given point or
period of time, and the lower complex identifies some elements, the left wall $w_l$ and
some elements of the interior $I$, to get fewer entities to store.

The space at time point $t_1$ models the before-after change derived by a 1D
overlay of the spatial model of $I$ before, and the spatial model of $J$ after the
change. Each overlay entity has a reference to its corresponding entity of the
input spaces. Mathematically, these are two topologically continuous partial
functions $p$ (like "past") and $f$ (like "future") from the overlay space back to the two
input spaces. Such functions are called attaching maps. Now we can topologically
paste [14, Ch. 3 §7] the overlay onto the "past" entities by specifying that
a tagged image element $(I, s_{0,1})$ is bounded by an element $(x, t_1)$ if the reference
$p(x) = I$ holds. We can also do that attachment onto the "future" by specifying
that $(w_l, s_{0,1})$ is bounded by $(w_l, t_1)$ because $p(w_l) = w_l$. Interestingly, $(w_r, t_1)$ is
a boundary element of the past wall trajectory $(w_r, s_{0,1})$ because of $p(w_r) = w_r$,
but it is a boundary element of the future interior trajectory $(J, s_{1,2})$ because of
$f(w_r) = J$.

This step is only an intermediate formal step which would create a lot of
redundant data when implemented explicitly. To avoid redundancy, a sequence
of a tagged spatial entity not changing over time can be collapsed into one.
Additionally, the user may specify further entities that, though having changed
over time, are considered "identical" before and after the change. In our example
this identification is carried out on $(w_l, s_{0,1})$, $(w_l, t_1)$, and $(w_l, s_{1,2})$, and it
collapses $(I, s_{0,1})$, $(I, t_1)$, and $(J, s_{1,2})$ to $(U, s_{0,1,2})$. Mathematically, this gives
a topological "quotient space" [14, Ch. 3] depicted in the lower part of Fig. 1.

Fig. 2. Directly combining the classical sequential space model with the same classical
approach for a temporal model creates eight classes for time-space entities. 



The same construction applies to 3D spatial models to get a 4D space-time
which is simply another topological space. However, when the 3D model is
represented by a chain of three associations, and when the temporal model also has
that classical layout, we get a "grid" of eight different classes of pairs of
spatial-temporal entities (cf. Fig. 2). This results in ten different space-time-boundary
associations, represented by lines between classes. On the other hand, if we
had only two classes SpatialEntity and TemporalEntity, each with a relation
BoundedBy, then there would be only one spatio-temporal class where each entity
consists of a pair (spatial, temporal), and only one boundary relation associating
each pair (spatial, temporal) with all (∂s, temporal) for every boundary
element ∂s of spatial and, dually, with the pairs (spatial, ∂t) for every boundary
time point ∂t of temporal. This yields the topology of the so-called product
space [16, Ch. I §3]. The class SpaceTimeEntity with a relation BoundedBy (cf.
Fig. 3) allows arbitrary space-time configurations to be modelled.

Fig. 3. The one space-time class and bounded-by association obtained by combining
the spatial and temporal models in a non-classical way.

When we consider id a surrogate key for pairs of space-time entities, the relational
schema (1) for 3D spatial data can be easily extended to cover spatio-temporal data:

Vertex(vid, x : ℝ, y : ℝ, z : ℝ, t : ℝ).

Note the simplicity. Like [1], it allows moving objects between two time points
that may be geometrically interpolated by a query. A simple topological SQL
query for the space at a given time point t with no interpolation is:

create view Xt as
with mmX(id, tmin, tmax) as (
    select X.id, min(V.t) as tmin, max(V.t) as tmax
    from (X join poR on (X.id = poR.ida))
         join Vertex V on (poR.idb = V.vid)
    group by X.id)
select X.* from X join mmX on (X.id = mmX.id)
where (mmX.tmin < t and t < mmX.tmax)
   or (mmX.tmin = t and t = mmX.tmax);

The sub-query mmX first associates each element X.id with its topological closure
cl(X.id) by joining it to poR. The join with Vertex selects all vertices of
cl(X.id). The group-by clause then computes the time interval for X.id.
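
The query can be tried out on a toy space-time. The sketch below uses Python's built-in sqlite3 module and makes several assumptions not found in the article: the time point t is passed as a named parameter `:t`, table X is reduced to its id column, the nested join is written without parentheses, and the sample data (one edge `e` as the trajectory of a point between `v0` and `v1`) is invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
create table X(id text primary key);
create table R(ida text, idb text);
create table Vertex(vid text primary key, x real, y real, z real, t real);
create view poR as
  with recursive Rst(ida, idb) as (
    select id as ida, id as idb from X
    union
    select Rst.ida, R.idb from Rst join R on (Rst.idb = R.ida))
  select * from Rst;
-- Edge e: trajectory of a point from v0 (t = 0) to v1 (t = 10).
insert into X values ('e'), ('v0'), ('v1');
insert into R values ('e', 'v0'), ('e', 'v1');
insert into Vertex values ('v0', 0, 0, 0, 0), ('v1', 1, 0, 0, 10);
""")

QUERY = """
with mmX(id, tmin, tmax) as (
  select X.id, min(V.t) as tmin, max(V.t) as tmax
  from X join poR on X.id = poR.ida
         join Vertex V on poR.idb = V.vid
  group by X.id)
select X.id from X join mmX on (X.id = mmX.id)
where (mmX.tmin < :t and :t < mmX.tmax)
   or (mmX.tmin = :t and :t = mmX.tmax)
order by X.id;
"""

def space_at(t):
    """Ids of all entities present at time t (no interpolation)."""
    return [row[0] for row in con.execute(QUERY, {"t": t})]

print(space_at(5), space_at(0))
```

Strictly inside the time interval only the trajectory edge is selected, whereas at an end point only the corresponding space-time vertex is returned.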

This query selects a subset $X_t$ of X. As the pair (X, R) defines a topological
space (X, T(R)), there should be a relation $R_t$ that generates the subspace
topology $T(R)|_{X_t}$ (cf. [16, Ch. I §3]). Simply restricting R to $X_t$ is generally
wrong [5]. Naively restricting poR to $X_t$ is correct but expensive. An optimal $R_t$
is achieved by passing that restriction to Codd's OPEN operator which returns a
minimal relation whose transitive closure is the same as that of the input relation
[17, p. 427]. Hence $R_t := \mathrm{OPEN}(\{(a, b) \in \mathrm{poR} \mid a \neq b,\ a \in X_t,\ b \in X_t\})$.
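
The difference between restricting R and restricting its closure can be seen in a small Python sketch. It is not an implementation of Codd's OPEN operator: as a stand-in, and under the assumption that R is acyclic, it uses a plain transitive reduction of the restricted strict order. All function names are illustrative.

```python
def closure(points, rel):
    """Reflexive and transitive closure R* of rel on points."""
    star = {(x, x) for x in points} | set(rel)
    changed = True
    while changed:
        new = {(a, d) for (a, b) in star for (c, d) in star if b == c}
        changed = not new <= star
        star |= new
    return star

def transitive_reduction(strict_order):
    """Drop every pair (a, c) of a strict partial order that factors through some b."""
    return {(a, c) for (a, c) in strict_order
            if not any((a, b) in strict_order and (b, c) in strict_order
                       for (_, b) in strict_order)}

def subspace_relation(points, rel, selected):
    """A minimal relation generating the subspace topology on `selected`."""
    star = closure(points, rel)
    restricted = {(a, b) for (a, b) in star
                  if a != b and a in selected and b in selected}
    return transitive_reduction(restricted)

X = {"face", "edge", "vertex"}
R = {("face", "edge"), ("edge", "vertex")}
Xt = {"face", "vertex"}                      # the edge is not selected

naive = {(a, b) for (a, b) in R if a in Xt and b in Xt}
Rt = subspace_relation(X, R, Xt)
print(naive, Rt)
```

Restricting R directly loses the indirect association between face and vertex, while the restriction of the closure keeps it.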

Just as the set $X_t$ is a θ-selection of X, the space $(X_t, R_t)$ can be considered
a topological θ-selection of (X, R), hence a "topologised" basic relational query
operator. The topology $T(R_t)$ is the minimal topology for which the inclusion
function $i : X_t \to X$ is continuous [14, Ex. 1]. A relationally complete topological
database query language is given in [5]. A first simple prototype implementation
can be found under http://pavel.gik.kit.edu/.

It is also noteworthy that an edge in X, representing the trajectory of a
vertex v from $t_0$ to $t_1$ with $t_0 < t < t_1$, topologically becomes a vertex in the
result space $X_t$ which does not contain the edge's space-time vertices.

5 Versioning

The version graph is a DAG whose vertices and edges correspond to the different
versions and modifications of a space, respectively. An edge indicates that version i
has been modified to obtain version j. Clearly, not all versions need to be stored
explicitly. It suffices to store the initial version, and any other version v can
be reconstructed from it by applying all modifications on the paths to v in the
version graph. It is also advisable to redundantly store more versions, e.g. the
current version, but this leads to the problem of balancing redundancy avoidance
against robustness and speed, and we will not delve into this matter. Note that
we only consider topological changes here.

The possible topological modifications of a pair (X, R) are: First, a point
can be added, and a point $x \in X$ can be removed. The latter forces relation
R to be modified by removing all pairs (y, x) and (x, z) with $y, z \in X$. But if
y, z are different from x with $(y, x) \in R$ and $(x, z) \in R$, then the pair (y, z)
must be added to R if it is not already contained in R. The reason is that the
modified set $X \setminus \{x\}$ should be a subspace of (X, R), meaning that the (possibly
indirect) bounded-by association y → z in X must be retained in $X \setminus \{x\}$ by
the induced bounded-by relation. Further, a pair of points (either new or not
removed from X) can be added to R, and a pair can be removed from R. These
are the elementary modifications, and a general modification is a sequence of
elementary modifications.
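
The removal of a point, including the bridging pairs (y, z), can be sketched in a few lines of Python. The function name `remove_point` and the sample space are assumptions for illustration, and the relation is assumed to be acyclic.

```python
def remove_point(points, rel, x):
    """Remove x; connect everything bounded by x to everything x was bounded by,
    so that the result is a subspace of the original space."""
    above = {y for (y, b) in rel if b == x and y != x}   # pairs (y, x) in R
    below = {z for (a, z) in rel if a == x and z != x}   # pairs (x, z) in R
    new_points = set(points) - {x}
    new_rel = {(a, b) for (a, b) in rel if x not in (a, b)}
    new_rel |= {(y, z) for y in above for z in below}
    return new_points, new_rel

X = {"face", "edge", "vertex"}
R = {("face", "edge"), ("edge", "vertex")}
X2, R2 = remove_point(X, R, "edge")
print(X2, R2)
```

After removing the edge, the face is directly bounded by the vertex, so the indirect association of the original space is retained.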

The following relation schema: 

X(id, version, atts),  version → VX (FK),  R(ida, idb),  ida → X (FK),  idb → X (FK)

stores the versions of a space. The version-attribute gives the version in which 
an object or bounded-by association appears for the first time, and VX is the 
object table of the version space described below. This yields a single space 
containing all objects appearing in some version. The following schema stores 
when (i.e. in which versions) an object or bounded-by association is deleted: 

DelX(id, version),  id → X,    DelR(ida, idb, version),  (ida, idb) → R

The version numbers and transitions are stored as: 



which is the version space V = (VX, VR), a topological data type for spaces of 
arbitrary combinatorial dimension. Since the version graph is a directed acyclic 
graph, the consistency rule for the version space is that of a $T_0$-space.

The topological merge of versions $(X_1, R_1)$, $(X_2, R_2)$ which are each modifications
of a common space (X, R) is a space (Y, S) fitting into the diagram:

The set Y consists of all points of X occurring in $X_1$ or $X_2$, together with any
additional points in either version. A conflict can occur if a point y is introduced
in version $X_1$, and also in $X_2$. In that case, y will have the same id-value in both
versions, and the conflict does occur if some attribute other than the version
value does not coincide in both versions. We call this an inherent conflict. In the
same way, the topology-defining relation S is given as the set of all pairs from
R occurring in $R_1$ or $R_2$, together with all new pairs from $R_1$ or $R_2$. Again, an
inherent conflict occurs for pairs (a, b) appearing in both relations $R_1$ and $R_2$, if
there exist attributes other than version and id which are different.

If there are no inherent conflicts, then the merge (Y, S) is a valid topological 
data type. However, there is a potential source for another type of conflict which 
we call consistency conflict. This happens if the topological space defined by 
(Y, S) violates some pre-assigned consistency rule. In the case of a conflict, a 
warning statement is issued, and it is up to the user to resolve the matter. 

The merge of texts can be viewed as a topological merge by viewing a string
as a linear DAG • → • → ··· → • whose semantic attribute takes values
in an alphabet. If the resulting space is again a linear DAG without inherent
conflict, then the topological merge of texts is valid. However, even without
inherent conflict one can obtain topological spaces not representing text. E.g.
the topological merge of texts in Fig. 4 exhibits this consistency conflict.
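
A hedged Python sketch of such a text merge follows. It treats the merge as the union of points and pairs, as described above, and reports inherent conflicts as ids whose letters disagree. The function names and the concrete ids given to the letters of "help" and "halo" are assumptions chosen for illustration.

```python
def merge(version1, version2):
    """Union of points and pairs; conflicts are ids whose attributes disagree."""
    (x1, r1), (x2, r2) = version1, version2
    conflicts = {i for i in x1.keys() & x2.keys() if x1[i] != x2[i]}
    return {**x1, **x2}, r1 | r2, conflicts

def is_linear_dag(points, rel):
    """True iff the pairs form a single path through all points."""
    succ = dict(rel)                      # collapses if a point has two successors
    if len(succ) != len(rel) or len(rel) != len(points) - 1:
        return False
    starts = set(points) - {b for (_, b) in rel}
    if len(starts) != 1:
        return False
    node, seen = starts.pop(), 1
    while node in succ and seen <= len(points):
        node, seen = succ[node], seen + 1
    return seen == len(points)

def text_space(ids, text):
    """A string as a linear DAG: each id carries a letter, neighbours are linked."""
    return dict(zip(ids, text)), set(zip(ids, ids[1:]))

help_v = text_space([1, 2, 3, 6], "help")   # hypothetical ids
halo_v = text_space([1, 7, 3, 5], "halo")   # hypothetical ids
Y, S, conflicts = merge(help_v, halo_v)
print(conflicts, is_linear_dag(Y, S))
```

Both inputs are linear and share the letters with ids 1 and 3 without disagreement, yet the merged space branches and therefore violates the "linear DAG" rule, which is the consistency conflict discussed above.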



[Diagram: the texts "help", "hello" (1 → 2 → 3 → 4 → 5) and "halo" (1 → 7 → 3 → 5) together with their merged graph.]



Fig. 4. A topological merge of texts violating the consistency rule "linear DAG" . 



Observe that the merge of two versions does not depend on the common 
source. The latter is only needed if the two versions are to be constructed from 
the modifications of the common source. But once the two spaces are known, 
it is only a matter of deciding which points and bounded-by associations they 
have in common, and where they supplement or possibly contradict each other, 
in order to obtain the topological merge. 

As backtracking is the only way to produce existing versions, it follows that
every version space is a $T_0$-space. We allow any finite $T_0$-space as a possible
version space. In particular, a version space need not have a unique starting
point, or even be connected. This allows one to begin with different parts of some
spatial model. Each is successively modified, modifications are merged, and
in the end one unique realisation is obtained through a final merge. In every
step, a valid topological data type is obtained, and at each merge the occasional
inherent and consistency conflict is resolved whenever it occurs.

Storing a DAG as a topological data type comes naturally. An alternative
would be a one-dimensional simplicial complex. However, this increases the size 
by the relation which associates edges with boundary vertices, plus the additional 
orientation information of edges. Hence, increasing the dimension does in fact 
lead to a decrease in complexity. 

Find all versions which were modified in order to obtain a given version. This
queries for all elements x in the version space V = (VX, VR) from which a
directed path leads to the given version v. These form the minimal neighbourhood

$U_v := \{ x \in VX \mid (x, v) \in VR^* \}$,        (2)

where VR* is the reflexive and transitive closure of VR. This set is obtained by
the following simple relational query which for convenience is given in SQL:

select fromv as version
from poVR
where poVR.tov = v;

where poVR, again, is the view on VR* as described in Sec. 3. This minimal
neighbourhood yields precisely the version numbers needed in order to build up 
the space belonging to v from the initial version(s). 
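
The same minimal neighbourhood can be computed without materialising VR*, by walking the version graph backwards. The following Python sketch is an illustration only: the function name and the sample version graph are assumptions, and pairs are read as (fromv, tov) in line with the attribute names of the SQL query.

```python
def minimal_neighbourhood(vr, targets):
    """All versions from which a (possibly empty) directed path leads into targets."""
    preds = {}
    for fromv, tov in vr:
        preds.setdefault(tov, set()).add(fromv)
    seen, stack = set(targets), list(targets)
    while stack:
        for p in preds.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# Hypothetical version graph: fork at 1, merge at 4, side branch 5.
VR = {(1, 2), (1, 3), (2, 4), (3, 4), (3, 5)}
U_4 = minimal_neighbourhood(VR, {4})
print(sorted(U_4))
```

Version 5 is not part of the result because no path leads from it to version 4.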

Do given versions and their modifications reconstruct a given version? The question
is whether all paths from the initial versions $V_0 \subseteq V$ to the given version
$v \in V$ pass through the given set $W \subseteq V$ of versions. The answer is given by the
following consideration: The minimal neighbourhood $U_A$ of a set $A \subseteq V$ is

$U_A := \{ x \in VX \mid \exists a \in A : (x, a) \in VR^* \}$,        (3)

and is the union of all paths ending in A. The paths going out of A are given by

$\mathrm{cl}(A) := \{ x \in VX \mid \exists a \in A : (a, x) \in VR^* \}$        (4)

which is the closure of A in V. Hence, the paths from $V_0$ to v are given by

$P(V_0, v) := U_v \cap \mathrm{cl}(V_0)$,        (5)

and the paths passing through W are all contained in $P(W) := U_W \cup \mathrm{cl}(W)$,
from which we infer that the answer is given by the query:

Is it true that $P(V_0, v) \subseteq P(W)$?        (6)

This breaks down to queries for minimal neighbourhoods and closures, composed 
by union, intersection and inclusion queries. The relational query for the minimal 
neighbourhood of a set A is a slight modification of that for points: 

select version from VX
where version in (
    select fromv
    from poVR
    where tov in (select id from A));

The closure cl(A) also uses the partial order poVR, and is obtained by reversing
the order of pairs, hence simply by swapping the attributes tov and fromv in the
above SQL statement. The relational query (6) can now be easily formulated,
but we refrain from writing its precise statement.
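
For readers who want to experiment, query (6) can be sketched outside SQL in a few lines of Python. This is one possible reading of equations (3) to (6), not the authors' statement of the query; the function names and the sample version graph are assumptions.

```python
def reach(pairs, sources):
    """Points reachable from sources along pairs, sources included."""
    succ = {}
    for a, b in pairs:
        succ.setdefault(a, set()).add(b)
    seen, stack = set(sources), list(sources)
    while stack:
        for n in succ.get(stack.pop(), ()):
            if n not in seen:
                seen.add(n)
                stack.append(n)
    return seen

def closure_of(vr, a):
    """cl(A) as in equation (4): everything reachable from A."""
    return reach(vr, a)

def neighbourhood_of(vr, a):
    """U_A as in equation (3): everything from which A is reachable."""
    return reach({(b, c) for (c, b) in vr}, a)

def reconstructs(vr, v0, v, w):
    """Query (6): is P(V0, v) a subset of P(W)?"""
    p_v = neighbourhood_of(vr, {v}) & closure_of(vr, v0)    # equation (5)
    p_w = neighbourhood_of(vr, w) | closure_of(vr, w)
    return p_v <= p_w

# Hypothetical version graph with two branches from 1 that merge in 4.
VR = {(1, 2), (1, 3), (2, 4), (3, 4)}
print(reconstructs(VR, {1}, 4, {2}), reconstructs(VR, {1}, 4, {2, 3}))
```

With W = {2} the branch through version 3 is missed, so the inclusion fails, whereas W = {2, 3} covers both branches.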

6 Levels of Detail 

It is often important to have spatial data at different levels of detail. In order 
to enable queries across different levels of detail it is necessary to link objects in 
one LoD with their aggregate object in the next LoD. As all objects in the finer 
LoD have a unique counterpart in the coarser LoD, we have a generalisation 
function g : X → Y from the fine space X to the coarse space Y. One important
consistency rule is that objects that are "close" to each other must generalise 
to objects which are also "close" to each other. This "closeness" is a topological 
notion and determined by the bounded-by relation defining the topology of a 
finite space. Respecting that relation is nothing but to require that g be continuous
[18, p. 508]. More precisely, g is a continuous function between spaces
(X, R) and (Y, S) if every bounded-by association for X is mapped to a (possibly
indirect) bounded-by association in Y: either $g(x_1) = g(x_2)$ or

$(x_1, x_2) \in R \;\Rightarrow\; \exists\, y_1, \dots, y_m \in Y : (g(x_1), y_1), \dots, (y_m, g(x_2)) \in S$.

This rule is equivalent to the usual definition of a continuous function from topology
[14, §4]: namely that the pre-image of an open set be open [16, Ch. I §5].
Notice that the subspace topology from the last paragraph of Sec. 4 is defined
in such a way that the inclusion function is continuous.

Another consistency rule is that g be surjective. Then every object in the 
coarse space is indeed the generalisation of an object in the fine space. 
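
Both consistency rules for a generalisation function can be tested mechanically. The Python sketch below is a hedged illustration: g is represented as a dictionary, continuity is checked by requiring that every pair of R maps into the reflexive and transitive closure of S, and all names and sample spaces are assumptions.

```python
def star(points, rel):
    """Reflexive and transitive closure of rel on points."""
    result = {(p, p) for p in points} | set(rel)
    while True:
        extra = {(a, d) for (a, b) in result for (c, d) in result if b == c}
        if extra <= result:
            return result
        result |= extra

def is_continuous(g, r, y_points, s):
    """Each (x1, x2) in R maps to equal images or to a chain of pairs in S."""
    s_star = star(y_points, s)
    return all((g[a], g[b]) in s_star for (a, b) in r)

def is_surjective(g, y_points):
    """Every coarse object is the image of some fine object."""
    return set(g.values()) == set(y_points)

# Fine space: face -> edge -> vertex. Coarse space: region -> city.
R = {("face", "edge"), ("edge", "vertex")}
Y = {"region", "city"}
S = {("region", "city")}

g_good = {"face": "region", "edge": "region", "vertex": "city"}
g_bad = {"face": "city", "edge": "region", "vertex": "city"}
print(is_continuous(g_good, R, Y, S), is_continuous(g_bad, R, Y, S))
```

The second function maps the face to the city and the edge to the region, which would require the city to be bounded by the region and therefore breaks continuity.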

For the classical model 

Solid → Face → Edge → Vertex        (7)

a continuous function g for generalisation purposes implies the explicit modelling
of up to $16 = 4^2$ possible types of association pairs, because g can map one class
to any other class, and there is no reason to forbid the mapping of one class
to a certain other class. Extending this to spaces of arbitrary dimension n (e.g.
space-time etc.) gives $(n + 1)^2$ different LoD associations to be modelled explicitly
in order to form one single generalisation function. So, the classical model
considerably increases the complexity for describing functions. But if LoD
associations are modelled on a common class of primitive objects, then the complexity
of the class model decreases substantially. In the UML diagram of Fig. 5
this class is called SpatialObject. Instead of several dimension-dependent
bounded-by associations there is simply one generic BoundedBy association be- 
tween arbitrary objects of a finite space in the most general (topological) sense. 
Functions are incorporated as a GeneralisesTo association respecting the con- 
sistency rule: "continuous functions that are surjective between two consecutive 
LoDs" . Of course, further consistency rules can be imposed if necessary. 



[Figure: UML class SpatialObject with two self-associations, BoundedBy and
GeneralisesTo {surjective, continuous}.]

Fig. 5. The family of surjective and continuous GeneralisesTo-functions and the
BoundedBy-relation on objects.



Assume that it is of interest whether there is a path from a to b inside a given
subset A of a space X with a generalisation function $g\colon X \to Y$. It would be
economical to transport the question to Y and ask for a path from g(a) to g(b)
inside the set g(A), and then infer back to X. If A is connected, the answer is
"yes". It is known that if A is connected, then g(A) is also connected [14, Ch.
7, Thm. 1]. In case the pre-image of every connected subset of Y is connected, g
is called monotonic [19, Ch. V, §46]. For monotonic generalisation functions it
follows that if A is the full pre-image of g(A), then the path query $a \rightsquigarrow b$ inside
A can be reduced to the path query $g(a) \rightsquigarrow g(b)$ inside g(A). In general, one has
$A \subseteq g^{-1}(g(A))$, so that a positive answer for $g(a) \rightsquigarrow g(b)$ yields the existence of
paths $a \rightsquigarrow b$ inside $g^{-1}(g(A))$, which still need to be checked as to whether they leave A.
Of course, by continuity, a negative answer for $g(a) \rightsquigarrow g(b)$ inside g(A) implies
a negative answer for $a \rightsquigarrow b$ inside A. The upshot is that the consistency rule
"monotonic" allows g to be used for accelerating the connectivity query for subsets A
which are pre-images of generalised sets, by delegating the query to the set g(A),
which in general is a smaller data set than the original A. Hence, the following
filtering approach can be applied:

Algorithm 1 (Path query) Input. Monotonic generalisation function
$g\colon X \to Y$, $A \subseteq X$, and $a, b \in A$.

Result. "Yes" if $a \rightsquigarrow b$ in A, otherwise "No".

Step 1. Compute g(a), g(b), g(A). Determine if $\exists\, g(a) \rightsquigarrow g(b)$ inside g(A).
Step 2. If "No", then answer = No. Otherwise, determine if $g^{-1}(g(A)) \subseteq A$.
Step 3. If "Yes", then answer = Yes. Otherwise, determine if $\exists\, a \rightsquigarrow b$ inside A.
Step 4. If "Yes", then answer = Yes. Otherwise, answer = No.
Output. answer.
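
The steps of Algorithm 1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: a pair (x, y) is read as "x is bounded by y", a path inside a subset is taken to be a zig-zag of (possibly indirect) bounded-by pairs whose elements all lie in the subset, and monotonicity of g is assumed, not verified. The example spaces and all identifiers are hypothetical.

```python
from collections import deque


def comparable_pairs(relation):
    """Transitive closure of (x, y) pairs, read as 'x is bounded by y'."""
    successors = {}
    for x, y in relation:
        successors.setdefault(x, set()).add(y)
    closure = set()
    for start in successors:
        stack, seen = list(successors[start]), set()
        while stack:
            y = stack.pop()
            if y not in seen:
                seen.add(y)
                closure.add((start, y))
                stack.extend(successors.get(y, ()))
    return closure


def has_path(a, b, subset, relation):
    """Breadth-first search over comparable elements, restricted to subset."""
    subset = set(subset)
    if a not in subset or b not in subset:
        return False
    neighbours = {x: set() for x in subset}
    for x, y in comparable_pairs(relation):
        if x in subset and y in subset:
            neighbours[x].add(y)
            neighbours[y].add(x)
    seen, queue = {a}, deque([a])
    while queue:
        x = queue.popleft()
        if x == b:
            return True
        for y in neighbours[x] - seen:
            seen.add(y)
            queue.append(y)
    return False


def path_query(a, b, A, R, S, g):
    """Algorithm 1; g is assumed (not checked) to be monotonic."""
    A = set(A)
    gA = {g[x] for x in A}
    # Step 1: ask the (usually smaller) coarse space first.
    if not has_path(g[a], g[b], gA, S):
        return False  # Step 2: a negative coarse answer is final.
    # Step 2: is A the full pre-image of g(A)?
    preimage = {x for x in g if g[x] in gA}
    if preimage <= A:
        return True  # Step 3: monotonicity lets the coarse answer carry over.
    # Steps 3 and 4: otherwise fall back to the fine space.
    return has_path(a, b, A, R)


# Hypothetical fine space (two edges, three vertices) and coarse space (one edge).
R = {("e1", "v1"), ("e1", "v2"), ("e2", "v2"), ("e2", "v3")}
S = {("f", "w1"), ("f", "w2")}
g = {"v1": "w1", "e1": "f", "v2": "w2", "e2": "w2", "v3": "w2"}
print(path_query("v1", "v3", set(g), R, S, g))
```

Only when the coarse answer is positive and A is not a full pre-image does the search touch the fine relation R, which is the filtering effect described above.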

Fig. 6. A generalisation to a non-classical space.



Example. Assume the generalisation of a region subdivided into polygons A, B, C,
as depicted in Fig. 6. There is a monotonic generalisation function to A', B', C'.
Now polygon C generalises to vertex C' in the interior of polygon B'. However,
the classical model (7) does not allow vertex C' to be in a direct bounded-by
association with a face. Consequently, the information that there is a path from
anywhere in A' to C' inside the generalised region cannot be inferred from the
classical model without resorting to the underlying geometry. This has grave
consequences. Namely, the geometry and topology of the generalised region are
necessarily inconsistent: geometry says that C' lies inside the face B', but the
classical model forces C' to be topologically disconnected from B'. Consequently,
the generalisation function is not continuous! And since the statement "B' is
bounded by C'" cannot be explicitly modelled, it has to be inferred from the
position of C'. But then, a possible position error of C' causing this vertex to be
outside B' cannot be corrected by the topological model. But in that case, the
incorrect geometry would be consistent with the classical topology model ...

This is remedied by topological data types. A direct bounded-by relation R
associates B' with the four lines adjacent to B' which are themselves, through
R, bounded by the four corners of B' which are not bounded by other objects.
Hence, the latter are vertices, implying that B' must be a face. And R associates
B' with C' which in turn is not bounded by an object. Hence, C' is a vertex at the
boundary of object B'. And now the generalisation function is continuous, and
furthermore monotonic. So, the question whether there is a path from anywhere
in A to C can be delegated to the simpler generalised region.
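
The reasoning of this paragraph (unbounded objects are vertices, and the kind of an object follows from what bounds it) can be sketched in a few lines. The sketch assumes that the dimension of an object is the length of the longest bounded-by chain starting at it, and that R is acyclic; the identifiers l1..l4 for the lines and p1..p4 for the corners are hypothetical, as the text does not name them.

```python
from functools import lru_cache

# Pairs (x, y) are read as "x is bounded by y".
R = {
    ("B'", "l1"), ("B'", "l2"), ("B'", "l3"), ("B'", "l4"),
    ("l1", "p1"), ("l1", "p2"), ("l2", "p2"), ("l2", "p3"),
    ("l3", "p3"), ("l3", "p4"), ("l4", "p4"), ("l4", "p1"),
    ("B'", "C'"),  # direct face-to-vertex association, impossible in model (7)
}


def dimensions(relation):
    """Map each object to the length of its longest bounded-by chain."""
    boundary = {}
    for x, y in relation:
        boundary.setdefault(x, set()).add(y)
        boundary.setdefault(y, set())

    @lru_cache(maxsize=None)
    def dim(x):
        # Objects without boundary are vertices (dimension 0).
        return 0 if not boundary[x] else 1 + max(dim(y) for y in boundary[x])

    return {x: dim(x) for x in boundary}


d = dimensions(R)
print(d["B'"], d["l1"], d["C'"])
```

Under these assumptions B' comes out as a face (dimension 2), the four lines as edges, and C' as a vertex, in line with the argument above.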

We now show how interpolation between different LoDs is made possible.
Assume that there is a chain $X_1 \to \dots \to X_m$ of generalisation functions, and
each space $X_i$ is embedded in some $\mathbb{R}^n$. We assign to each $X_i$ an extra
level-coordinate $i \in \mathbb{R}$, leading to an embedding into $\mathbb{R}^{n+1} = \mathbb{R}^n \times \mathbb{R}$, and each LoD
i sits inside some slice $\mathbb{R}^n \times \{i\}$. The generalisation function $g_i$ associates each
$x \in X_i$ with some $g_i(x) \in X_{i+1}$. If x and $g_i(x)$ are vertices, then they have, by
our model, coordinates in $\mathbb{R}^{n+1}$, and an interpolating function between x and
$y = g_i(x)$ can be defined. So, by assigning to every element x of every $X_i$ coordinates
(e.g. by taking a representative in the interior of the geometric realisation of
x), it becomes possible to define interpolation functions between any object
x and $g_i(x)$. The interpolation function can be given as a family of functions
$f_x\colon [0,1] \to \mathbb{R}^{n+1}$ with $f_x(0) = x$ and $f_x(1) = g_i(x)$, and as long as $t \in [0,1)$
the objects $f_x(t)$ are considered a placeholder for x inside $X_i$, but when t = 1
they become generalised to $f_x(1) = g_i(x)$. In this way, the topology of $X_i$ stays
unchanged while $t \in [0,1)$ and becomes that of $X_{i+1}$ as soon as t = 1. The
geometry can be made to gradually shrink from one LoD to the next. By giving
coordinates to all elements, and not only the vertices, it becomes possible in
a geometric realisation of a continuous zoom, for each vertex to have a unique
trajectory, and so a unique position inside each slice between two LoDs. This
includes also those vertices generalising to elements which are not vertices. In
fact, unique trajectories become possible for all elements through the positions
of their geometric representatives (i.e. coordinates) in $\mathbb{R}^{n+1}$.
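
As an illustration of such a family of functions, the sketch below uses straight-line interpolation. That choice is an assumption made here for simplicity; the text only requires the two end conditions at t = 0 and t = 1. The function name and its parameters are hypothetical.

```python
def interpolate(p, q, level, t):
    """Point f_x(t) on the segment from (p, level) to (q, level + 1).

    p     -- coordinates of the representative of x in LoD `level`
    q     -- coordinates of the representative of g_i(x) in LoD `level + 1`
    t     -- zoom parameter in [0, 1]
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of coordinates")
    start = (*p, float(level))    # x lives in the slice at height `level`
    end = (*q, float(level + 1))  # its generalisation lives one slice higher
    return tuple((1.0 - t) * s + t * e for s, e in zip(start, end))


# A fine vertex at (0, 0) in LoD 1 whose generalisation sits at (2, 4) in LoD 2.
print(interpolate((0.0, 0.0), (2.0, 4.0), 1, 0.5))
```

For t in [0, 1) the returned point stands in for x with the topology of the finer LoD; only at t = 1 does it coincide with the representative of the generalised object.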

From the above, we derive the following relational schema for generalisations
which allow interpolations between LoDs. Notice that the table Vertex from
Sec. 3 is now called Point, as it may contain elements which are not vertices.

X(id, lod, gid, glod, atts),    (gid, glod) $\xrightarrow{C}$ X
R(ida, idb, lod),               (ida, lod) $\to$ X,  (idb, lod) $\to$ X
Point(pid, lod, x, y, z, t),    (pid, lod) $\to$ X

Here, (gid, glod) $\xrightarrow{C}$ X denotes a continuous foreign key, i.e. a foreign key
which defines a continuous function from the set of all tuples in X having
(gid, glod) $\neq$ (NULL, NULL) to X with respect to the topology generated by R.
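
A minimal sketch of how this schema might be declared, here with Python's standard `sqlite3` module. The column types and the choice of primary keys are assumptions, since the text does not state them. Ordinary foreign keys only enforce referential integrity: the continuity of (gid, glod) is not expressible as a declarative constraint in SQLite and would need a separate check.

```python
import sqlite3

DDL = """
CREATE TABLE X (
    id INTEGER NOT NULL, lod INTEGER NOT NULL,
    gid INTEGER, glod INTEGER, atts TEXT,
    PRIMARY KEY (id, lod),
    -- plain foreign key; continuity must be verified separately
    FOREIGN KEY (gid, glod) REFERENCES X (id, lod)
);
CREATE TABLE R (
    ida INTEGER NOT NULL, idb INTEGER NOT NULL, lod INTEGER NOT NULL,
    PRIMARY KEY (ida, idb, lod),
    FOREIGN KEY (ida, lod) REFERENCES X (id, lod),
    FOREIGN KEY (idb, lod) REFERENCES X (id, lod)
);
CREATE TABLE Point (
    pid INTEGER NOT NULL, lod INTEGER NOT NULL,
    x REAL, y REAL, z REAL, t REAL,
    PRIMARY KEY (pid, lod),
    FOREIGN KEY (pid, lod) REFERENCES X (id, lod)
);
"""


def make_schema():
    """Create the three tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite disables them by default
    conn.executescript(DDL)
    return conn


def is_rejected(conn, sql, params):
    """True if the statement violates a key or foreign key constraint."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


conn = make_schema()
conn.execute("INSERT INTO X VALUES (1, 1, NULL, NULL, 'coarse')")
conn.execute("INSERT INTO X VALUES (1, 0, 1, 1, 'fine')")
conn.execute("INSERT INTO X VALUES (2, 0, 1, 1, 'fine')")
```

Because R.lod takes part in both foreign keys from R, a bounded-by pair can only relate two objects of the same LoD.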

The attribute R.lod in both foreign keys from R to X gives a disjoint union
of an indexed family of spaces [16, Ch. I, §3]. This schema models each LoD in its
entirety, whereas [7] advocates the idea that objects in one LoD which do not collapse
with other objects in the next LoD need not be repeated in the model. It would
be interesting to compare both approaches.

7 Integrating the Different Spaces 

Here, we put together the individual schemas for space-time, version and scale
into one single schema. The different data models are integrated by collecting all
tables, incorporating the attributes, and providing the foreign keys. As the
LoD-attribute is part of the primary key for the point set, all elements (not only
vertices) can be given space-time coordinates. This leads to the following tables:

X(id, lod, gid, glod, version, atts), R(ida, idb, lod), Point(pid, lod, x, y, z, t)
DelX(id, lod, version), DelR(ida, idb, lod, version), VX(version), VR(fromv, tov)

And the corresponding foreign keys are:

X.version $\to$ VX, (X.gid, X.glod) $\xrightarrow{C}$ X
(R.ida, R.lod) $\to$ X, (R.idb, R.lod) $\to$ X
(Point.pid, Point.lod) $\to$ X, (DelX.id, DelX.lod) $\to$ X
(DelR.ida, DelR.lod) $\to$ X, (DelR.idb, DelR.lod) $\to$ X,
VR.fromv $\to$ VX, VR.tov $\to$ VX

All spaces made of polytopes in Euclidean space-time $\mathbb{R}^4$ can be consistently
modelled with their versionings and generalisations. Although the combinatorial
dimension can be arbitrary, the element coordinates are x, y, z, t. Introducing
more coordinates increases the dimension of the embedding space $\mathbb{R}^n$. Other
semantic data can be linked to the model by extending the database schema.

In which versions is there a path $a \rightsquigarrow b$ inside region A? This query across
versions and LoDs is answered with the schema above. Assume that $a, b \in X$ and
each point of $A \subseteq X$ are given by (id, lod). First, find all versions of A containing
a and b. As elements enter X in the version given by version, and leave X in the
version stored in DelX, the versions containing a and b are the intersection of the
version intervals for the two points. These intervals are determined by the
relation VR on the version table VX. For each corresponding version-value, there is
a different version of A as a subspace of (X, R), determined from X, DelX, DelR,
and for each version containing a and b, the query can be answered as follows.
If a and b are in the same LoD, then its topology, stored in R, can be used. If
a and b are in different LoDs, then the continuous foreign key CFK(R) is used
for determining paths between those LoDs. If CFK(R) is monotonic, the query
can be accelerated by Algorithm 1. This query uses the whole schema, except
for Point. An explicit formulation in SQL is left to the reader.
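
The first step of this query, finding the versions that contain both a and b, might be sketched as below. The sketch rests on an assumed reading of the version model: an element is present in its creation version and in every version reachable from it along VR, except in the versions where it is deleted and everything reachable from those. All names are hypothetical.

```python
def descendants(version, VR):
    """Versions reachable from `version` along (fromv, tov) pairs, itself included."""
    reached, stack = {version}, [version]
    while stack:
        v = stack.pop()
        for fromv, tov in VR:
            if fromv == v and tov not in reached:
                reached.add(tov)
                stack.append(tov)
    return reached


def versions_containing(created, deleted_in, VR):
    """Version 'interval' of one element (assumed semantics, see lead-in)."""
    alive = descendants(created, VR)
    for d in deleted_in:
        alive -= descendants(d, VR)
    return alive


def common_versions(elements, VR):
    """Intersect the version intervals of (created, deleted_in) pairs."""
    sets = [versions_containing(c, d, VR) for c, d in elements]
    return set.intersection(*sets) if sets else set()


# Hypothetical version graph: 1 -> 2, then 2 branches into 3 and 4.
VR = {(1, 2), (2, 3), (2, 4)}
a = (1, {3})    # created in version 1, deleted in version 3
b = (2, set())  # created in version 2, never deleted
print(common_versions([a, b], VR))
```

For each version in the result, the path query itself would then be run on the subspace of (X, R) that belongs to that version.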

The Integrated 6D+ Space. So far, we have a system of models for "elementary"
spaces (cf. Sec. 2). To show how these can be combined into one space, we first
extract the LoD space (LoDX, LoDR) by two queries that project Xv and Rv,
a version v of the stored spaces, onto its two lod attributes:

    create view LoDX as
        select lod from Xv union select glod as lod from Xv;
    create view LoDR as
        select lod, glod from Xv;

A query converts this into the 1D edge graph (V, RV), V being the "union"
of LoDX with LoDR after duplicating the identifiers x in LoDX into pairs
(lod, glod) = (x, x) to make it union compatible with LoDR. The relation RV
contains ((a, b), (a, a)) and ((a, b), (b, b)) for every pair (a, b) in LoDR. Adding
the graph Gv of the generalisation function in Xv to Rv, after making them
union compatible, yields a new topology $T(Gv \cup Rv)$. By an equi-join of spaces
$(Xv, Gv \cup Rv)$ and (V, RV) on Xv.lod and V.lod, we get an equi-join space, or a
topological pullback [12, p. 406], of dimension $\geq 5$. This is also a combinatorial
variant of the mapping telescope [12, p. 312]. However, for each LoD i, except
the coarsest, it contains two redundant copies of one space: the space at LoD
vertex (x, x) and a homeomorphic copy at LoD edge (x, g(x)). So, the integrated
5D+ space has redundant information, which is not unusual with table joins.
One task of database design is to normalise by factoring redundant tables into
smaller tables. Similarly, it is also possible to integrate the whole versioning
history and the LoDs, which gives an even bigger 6D+ space with even more
redundancies. For the above schema this means: Whereas it is possible to
formulate a query that integrates all spaces into one huge space, this space will have
anomalies. It may therefore serve as a view for integrated space-time-version-view
queries, but it is not suitable as a relational schema for storage. In short: We
propose the future development of a topological
relational database design theory extending its relational counterpart that has
proved so successful in recent years.
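
The construction of the LoD edge graph (V, RV) described at the start of this passage is small enough to spell out. The sketch below follows the text literally; it assumes that LoDX and LoDR are given as Python sets and that pairs with a NULL glod (the coarsest LoD) have already been filtered out of LoDR.

```python
def lod_edge_graph(lodx, lodr):
    """Build the 1D edge graph (V, RV) from the LoD space (LoDX, LoDR)."""
    # Each LoD x becomes the "vertex" (x, x); each LoDR pair becomes an "edge".
    V = {(x, x) for x in lodx} | set(lodr)
    RV = set()
    for a, b in lodr:
        RV.add(((a, b), (a, a)))  # the edge is bounded by its fine LoD ...
        RV.add(((a, b), (b, b)))  # ... and by its coarse LoD
    return V, RV


# Hypothetical chain of three LoDs: 0 generalises to 1, and 1 to 2.
V, RV = lod_edge_graph({0, 1, 2}, {(0, 1), (1, 2)})
print(sorted(V))
```

Joining the stored space with this graph on the lod attribute then attaches one copy of each LoD to its vertex (x, x) and one to its edge (x, g(x)), which is the redundancy noted above.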

8 Implementation 

In Sec. HJ the subset $X_t$ at a time point t of a set X in space-time was obtained
by a relational selection. This was converted into a topological subspace. In fact,
all basic query operators of relational algebra [8] can be turned into topological
relational query operators operating on spaces and returning spaces, analogous to
their relational counterparts, as demonstrated in [9]. Here, we briefly describe
the prototype on pavel.gik.kit.edu, currently designed as a first experimental
implementation of the semantics of the query operators.

There are two classes of topological constructions P31 Ch. 3]: the initial (or
"induced") spaces and the final (or "co-induced") spaces. A relational query
operator on some input spaces operates on their elements and returns a set
X. Now the result tuples are linked with the spatial entities from the input
by functions: either from X back to the input (intersection, selection, or join),
giving an initial space, or the function maps input entities to X (like union,
projection), giving a final space.
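
For finite spaces given by a bounded-by relation, the two constructions can be sketched as pulling the relation back or pushing it forward along the linking function. This is an illustration in Python, not the Lisp prototype; it handles a single linking function f given as a dictionary, and the function names are hypothetical.

```python
def closure(relation):
    """Transitive closure of (x, y) pairs, read as 'x is bounded by y'."""
    successors = {}
    for x, y in relation:
        successors.setdefault(x, set()).add(y)
    result = set()
    for start in successors:
        stack, seen = list(successors[start]), set()
        while stack:
            y = stack.pop()
            if y not in seen:
                seen.add(y)
                result.add((start, y))
                stack.extend(successors.get(y, ()))
    return result


def initial_relation(f, relation):
    """f maps result elements back to the input (selection, intersection)."""
    below = closure(relation)
    # Distinct elements with equal images become related in both directions,
    # i.e. topologically indistinguishable.
    return {(x, y) for x in f for y in f
            if x != y and (f[x] == f[y] or (f[x], f[y]) in below)}


def final_relation(f, relation):
    """f maps input elements to result elements (projection, union)."""
    return {(f[x], f[y]) for x, y in relation if f[x] != f[y]}


R = {("face", "edge"), ("edge", "vertex")}
# Selection that drops the edge: the inclusion links the result to the input.
selected = initial_relation({"face": "face", "vertex": "vertex"}, R)
# Projection that merges the edge and the vertex into one result tuple "P".
projected = final_relation({"face": "F", "edge": "P", "vertex": "P"}, R)
print(selected, projected)
```

In the selection example the face remains (indirectly) bounded by the vertex even though the connecting edge was not selected, which is the subspace behaviour expected of an initial space.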

The prototype is programmed in Common Lisp, and has its own simplified
relational algebra in Lisp syntax. A space can be defined by the space constructor
whose input is a set, the topology-defining relation and the two foreign keys. Each
basic operator op has its spatial counterpart op-space, like natjoin-space, or
project-space, that acts on the sets, constructs the corresponding (initial or
final) topology for the result set and returns a space. These operators can be
arbitrarily nested. The experience gained will help to produce a topological
relational database management system which should provide the topological data
modelling presented above as a built-in feature. It would also provide topological
consistency rules, and could be a starting point for the discussion of topological
data modelling rules towards a topological data modelling theory extending the
current relational modelling theory.

9 Conclusion 

A relational database schema based on Alexandrov topology, that seamlessly
integrates 4D space-time data, version histories, and different levels of detail
(LoD), is presented. Such a topology can always be represented by a directed
acyclic graph, and it imposes fewer restrictions than the canonical
Solid-Face-Edge-Vertex model for spatial data. The gained flexibility and simplicity
facilitate more sophisticated spatial data modelling, and endow spatial data with
an Alexandrov topology, which has practical consequences: As topology is
fundamental, it is likely to have more to offer for spatial data modelling than is
currently used. A first contribution is a precise definition of "spatial data
dimension". A topological version space allows the recovery of different versions
of a spatial model by using queries based on topological constructions. Among
the new consistency rules, "continuity" of foreign keys allows consistent
modelling in different LoDs, and "monotonicity" allows accelerated path queries.
Also, topological queries across versions and LoDs are enabled.